Abstract
Autofocusing is a fundamental step towards automated microscopic screening of Caenorhabditis elegans. Determining the optimal focus in an optical microscope is based on a clarity-evaluation function that is applied to images acquired at different focal positions of the same field. The maximum value of the function is taken as the point of optimal focus. In this paper, 16 autofocus algorithms, drawn from well-known algorithms as well as the most recently proposed focusing algorithms, were evaluated in order to find an optimal algorithm for C. elegans lipid droplets and to set up an automatic screening system. Several features were assessed, including accuracy, computational time, robustness to added noise, and the shape of the focusing curve. Our results show that most of the algorithms perform well overall for this type of image, and the absolute Tenengrad algorithm is our first choice because it gives the best accuracy.
Key words  C. elegans lipid droplets, autofocus algorithms, automatic screening

Caenorhabditis elegans (C. elegans) is an established model organism for studying lipid droplets and energy metabolism because it has many advantages. For example, its mutants are simple to use as genetic tools, and it is easily examined under a microscope. In the past few years, owing to the identification of lipid droplet-associated proteins such as DHS-3 and PLN, lipid droplets can be marked with fluorescent protein tags that serve as a fat storage indicator. RNA interference (RNAi) has been investigated in C. elegans since 1998 by Fire et al. (Nobel Prize in Physiology or Medicine in 2006). Through an RNAi screen, uncharacterized fat storage regulatory genes can be identified in C. elegans. However, manual identification and counting of C. elegans lipid droplets is exhausting and time-consuming. Moreover, it requires a trained operator and has a high false-negative rate. Therefore, in order to reduce this rate and speed up the process, automatic screening is necessary.

Focusing is a fundamental and crucial step in an automatic system. Determining the optimal focus in an optical microscope is based on a clarity-evaluation function that is applied to images acquired at different focal positions of the same field. The maximum value of the function is regarded as the point of optimal focus. Many autofocus algorithms have been proposed in the literature, but their accuracy can vary depending on the content of the processed images.
In previous studies, Santos et al. compared autofocus algorithms in molecular cytogenetics and concluded that the Vollath-4 method is the most appropriate for FISH (fluorescence in situ hybridization) images. Osibote et al. determined that Vollath-4 has the best focus accuracy for tuberculosis in bright-field microscopy. However, other studies such as Kimura et al. found that the variance method provided the best overall performance for tuberculosis. Liu et al. drew the same conclusion for both blood smears and pap smears. Redondo et al. claimed that variance, normalized variance and Vollath-5 are the most suitable methods for automatic systems in bright-field microscopy pathology. Furthermore, the study by Mateos-Pérez et al. included additional assessment features such as variability of the fluorescence microscopy images, addition of noise, illumination changes, and prefiltering processes.
This paper aims to determine the optimal autofocus algorithms for fluorescence imaging of C. elegans lipid droplets through a systematic evaluation of 16 commonly used autofocus algorithms. A set of criteria, namely accuracy error, computation time, the number of local maxima, the FWHM (full width at half maximum) of the focus curve, and noise robustness, was used to evaluate the performance of the autofocus algorithms. In the previous literature, the optimally focused image was identified by trained operators, which introduces some uncertainty and inaccuracy, because different observers could choose slightly different images around the optimal focus plane. For example, Osibote et al. determined that Vollath-4 has the best focus accuracy for tuberculosis, whereas Kimura et al. found that variance provides the best overall performance for tuberculosis. Therefore, we adopted an objective way to determine the optimal focus plane in this paper to avoid subjective deviation.


1 Materials 


Strains Pvha-6::dhs-3::GFP (single copy) were created by Professor Ho Yi Mak (Hong Kong University of Science and Technology). DHS-3 can be used as a lipid droplet marker protein in C. elegans, as well as a fat storage indicator in live worms. This marker protein will facilitate further mechanistic studies of lipid droplets in C. elegans. All strains were maintained on NGM plates under standard conditions.

Worms were raised at approximately 22°C and prepared for imaging as described previously. Briefly, the worms were soaked in a solution of 0.1% tricaine and 0.01% levamisole (Sigma-Aldrich) in M9 for 20 to 30 min prior to imaging. The immobilized worms were then transferred with a glass hook to a slab of 3% agarose in M9. The coverslip was then sealed with Vaseline.


Images were acquired using a motorized Zeiss Imager M2 microscope equipped with an AxioCam CCD camera (Carl Zeiss, Germany) and an X-Cite 120Q light source (Lumen Dynamics, Canada), driven by Axiovision (Carl Zeiss, Germany). Green fluorescence images were acquired at a resolution of 1388 × 1040 pixels with 16 bits of dynamic range in grayscale. A trained operator selected the best focal plane, from which 30 images were captured upward in the axial direction and another 29 downward, so that each stack consists of 60 images. Two different magnifications were used: ×10 (NA = 0.30) and ×40 (NA = 0.75). Each magnification used a different Z step: ΔZ = 0.5 μm at ×40 and ΔZ = 2 μm at ×10. Ten stacks were captured at each of the two magnifications, from ten different worms. All algorithms were implemented in Matlab 7.6.0 (The MathWorks, USA) on an Intel Core i5 3.50 GHz computer with 16 GB RAM running the Windows 8 operating system (Microsoft, USA).
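As a quick check on the acquisition geometry, and assuming the 60 frames of a stack are equally spaced (so that they span 59 steps), the axial range covered at each magnification works out as follows. This is a derived illustration, not a figure reported by the authors.

```latex
\begin{align*}
\text{range}_{\times 40} &= (60 - 1) \times 0.5\ \mu\text{m} = 29.5\ \mu\text{m} \\
\text{range}_{\times 10} &= (60 - 1) \times 2\ \mu\text{m} = 118\ \mu\text{m}
\end{align*}
```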


2 Autofocus methods 


Autofocus is a characteristic feature of automatic systems. There are two kinds of autofocus methods: active methods and passive methods. Active methods are based on measuring the distance between the lens and the object in the scene by emitting ultrasonic or infrared waves. However, these methods have limitations in the case of live model organisms such as C. elegans. Passive methods are based on analyzing the image sharpness of the objects with autofocus functions, which are usually related to the high frequencies of the image.

The autofocus function gives a numerical value that indicates the degree of focus for each image of the same sample. The fundamental assumption behind most of the functions is that a defocused image results from the convolution of the image with a certain point spread function (PSF), which usually decreases the high-frequency content of the image. Equivalently, focused images are assumed to contain more information and detail than defocused images. The sixteen autofocus functions analyzed in this study can be classified into five groups: (1) derivative-based functions, (2) transform-based functions, (3) statistics-based functions, (4) histogram-based functions and (5) intensity-based functions. For an image of size M×N, the notation g(x, y) refers to the image intensity at point (x, y), while the symbol ⊗ indicates the convolution operator.
2.1 Derivative-based function

These functions are based on image derivatives and assume that defocused images usually have less high-frequency content than focused images.

Brenner gradient function (BREN). This function calculates the first difference between a pixel and its neighbor two points away.

$$AF_{BREN} = \sum_{x}\sum_{y}\left[g(x, y+2) - g(x, y)\right]^2 \tag{1}$$

while $\left|g(x, y+2) - g(x, y)\right|^2 \geq T$ (threshold value).
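A minimal Python sketch of the Brenner measure may help make Eq. (1) concrete. It assumes the image is a list of rows indexed as `img[x][y]` and that the threshold is applied to the squared difference; the function name `brenner` and the default `threshold=0.0` are illustrative choices, not taken from the paper.

```python
def brenner(img, threshold=0.0):
    """Brenner focus value: sum of squared differences between each pixel
    and its neighbour two positions away along y (Eq. 1)."""
    total = 0.0
    for row in img:                      # the row index plays the role of x
        for y in range(len(row) - 2):
            d2 = (row[y + 2] - row[y]) ** 2
            if d2 >= threshold:          # keep only differences at or above T
                total += d2
    return total


# A sharp step edge scores higher than a smoothed version of the same edge.
sharp = [[0, 0, 10, 10], [0, 0, 10, 10]]
blurred = [[0, 3, 7, 10], [0, 3, 7, 10]]
print(brenner(sharp), brenner(blurred))
```

The toy images show the intended behaviour: blurring spreads the edge over more pixels, which lowers the squared two-pixel differences.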

Laplacian (LAP). This focus function was originally used to find focusing errors caused by noise. The algorithm convolves the image with a discrete Laplacian mask as follows:

$$AF_{LAP} = \sum_{x}\sum_{y}\left[g(x-1, y) + g(x+1, y) + g(x, y-1) + g(x, y+1) - 4g(x, y)\right]^2 \tag{2}$$
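The Laplacian measure can be sketched in a few lines of Python. This sketch assumes the usual 4-neighbour Laplacian with a centre weight of -4 and skips border pixels; the border handling and the name `laplacian_focus` are assumptions, since the paper does not specify them.

```python
def laplacian_focus(img):
    """Sum of squared responses of the 4-neighbour discrete Laplacian (Eq. 2).
    Border pixels are skipped (an assumption; the paper does not say)."""
    m, n = len(img), len(img[0])
    total = 0.0
    for x in range(1, m - 1):
        for y in range(1, n - 1):
            lap = (img[x - 1][y] + img[x + 1][y]
                   + img[x][y - 1] + img[x][y + 1] - 4 * img[x][y])
            total += lap ** 2
    return total


flat = [[5] * 3 for _ in range(3)]          # uniform image: no detail
spot = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]    # single bright pixel
print(laplacian_focus(flat), laplacian_focus(spot))
```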

Tenengrad (TEN). This algorithm convolves an image with Sobel operators and then sums the squares of all the gradient magnitudes greater than a threshold value.

$$AF_{TEN} = \sum_{x}\sum_{y}\left[g(x, y) \otimes S\right]^2 + \left[g(x, y) \otimes S'\right]^2, \quad \nabla g(x, y) > T \tag{3}$$

where $S$ and $S'$ are the Sobel kernel and its transpose, respectively, and $S$ is given by:

$$S = \begin{bmatrix} 1 & 0 & -1 \\ 2 & 0 & -2 \\ 1 & 0 & -1 \end{bmatrix} \tag{4}$$


Absolute Tenengrad (ATEN). This focus function is similar to Eq. (3), but the absolute value of the gradient coefficients is used in order to reduce the computation time.

$$AF_{ATEN} = \sum_{x}\sum_{y}\left|g(x, y) \otimes S\right| + \left|g(x, y) \otimes S'\right| \tag{5}$$
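The two Tenengrad variants can be compared side by side in a short Python sketch. It assumes border pixels are skipped and that the threshold in Eq. (3) is applied to the gradient magnitude; the kernel is applied as a correlation rather than a flipped convolution, which only changes the sign of the Sobel response and therefore does not affect the squared or absolute values. All function names are illustrative.

```python
SOBEL = [[1, 0, -1],
         [2, 0, -2],
         [1, 0, -1]]
SOBEL_T = [list(col) for col in zip(*SOBEL)]  # transpose of S


def filter3(img, kernel, x, y):
    """Response of a 3x3 kernel centred on pixel (x, y)."""
    return sum(kernel[i][j] * img[x + i - 1][y + j - 1]
               for i in range(3) for j in range(3))


def tenengrad(img, threshold=0.0):
    """Sum of squared Sobel gradient magnitudes above a threshold (Eq. 3)."""
    total = 0.0
    for x in range(1, len(img) - 1):
        for y in range(1, len(img[0]) - 1):
            gx = filter3(img, SOBEL, x, y)
            gy = filter3(img, SOBEL_T, x, y)
            mag2 = gx * gx + gy * gy
            if mag2 ** 0.5 > threshold:
                total += mag2
    return total


def abs_tenengrad(img):
    """Sum of absolute Sobel responses, with no squaring (Eq. 5)."""
    total = 0.0
    for x in range(1, len(img) - 1):
        for y in range(1, len(img[0]) - 1):
            total += (abs(filter3(img, SOBEL, x, y))
                      + abs(filter3(img, SOBEL_T, x, y)))
    return total


edge = [[0, 0, 10, 10] for _ in range(3)]   # vertical step edge
print(tenengrad(edge), abs_tenengrad(edge))
```

The absolute variant avoids the multiplications and the magnitude test, which is consistent with the stated motivation of reducing computation time.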


Gaussian filter (GS). This focus function is based on a gradient filter derived from convolving the image with a first-order Gaussian derivative.

$$AF_{GS}(\sigma) = \sum_{x}\sum_{y}\left[g(x, y) \otimes G_x(x, y, \sigma)\right]^2 + \left[g(x, y) \otimes G_y(x, y, \sigma)\right]^2 \tag{6}$$

where $G_x$ and $G_y$ are the first-order Gaussian derivatives in the x and y directions. The $\sigma$ parameter of the Gaussian method should be adjusted in relation to the objects present in the image. We evaluated different $\sigma$ values to test the robustness of the method. Here we set $\sigma = 1$.

2.2 Transform-based function 

Transform-based functions calculate the degree of focus of each image by means of a mathematical transform.

Discrete Cosine Transform (DCT). Focusing techniques based on band-pass filters perform well. In this algorithm, images are divided into blocks of 40×40 pixels, the DCT is applied to each block, and the sum of four band-pass diagonal bands representing mid and high frequencies is chosen:

$$C(u, v) = \sum_{x}\sum_{y} g(x, y)\cos\left[\frac{\pi(2x+1)u}{2M}\right]\cos\left[\frac{\pi(2y+1)v}{2N}\right] \tag{7}$$

Midfrequency-DCT (MDCT). The influence of the band-pass DCT coefficients on the focus measure has been analyzed. A 4×4 convolution mask is used for extracting the central coefficient C(4, 4) of the DCT, which is used as a focus measurement. The MDCT operator can be calculated as:

$$AF_{MDCT} = \sum_{x}\sum_{y}\left(g(x, y) \otimes O_{MDCT}\right)^2 \tag{8}$$

with

$$O_{MDCT} = \begin{bmatrix} 1 & 1 & -1 & -1 \\ 1 & 1 & -1 & -1 \\ -1 & -1 & 1 & 1 \\ -1 & -1 & 1 & 1 \end{bmatrix} \tag{9}$$
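The MDCT operator amounts to a single 4×4 filter followed by a sum of squares, which the following Python sketch illustrates. It assumes the third row of the mask is -1 -1 1 1 (inferred from the symmetry of the other rows), that the filter response is squared before summation, and that only positions where the mask fits entirely inside the image are used. Because this mask is unchanged by a 180° rotation, correlation and convolution give the same result here. The name `mdct_focus` is illustrative.

```python
O_MDCT = [[1, 1, -1, -1],
          [1, 1, -1, -1],
          [-1, -1, 1, 1],
          [-1, -1, 1, 1]]


def mdct_focus(img):
    """Sum of squared responses of the 4x4 mid-frequency DCT mask.
    Only positions where the mask fits inside the image are used (assumption)."""
    m, n = len(img), len(img[0])
    total = 0.0
    for x in range(m - 3):
        for y in range(n - 3):
            r = sum(O_MDCT[i][j] * img[x + i][y + j]
                    for i in range(4) for j in range(4))
            total += r * r
    return total


flat = [[7] * 4 for _ in range(4)]
blocks = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
print(mdct_focus(flat), mdct_focus(blocks))
```

The mask sums to zero, so a uniform image gives no response, while a pattern matching the mask's own block structure gives the strongest one.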


Wavelet transformation (WT). This function is based on the discrete wavelet transformation. The wavelet focus function calculates the ratio of the energy in the high-pass and low-pass bands.

$$AF_{WT} = \frac{\left\|h_w(f)\right\|^2}{\left\|l_w(f)\right\|^2} \tag{10}$$

where $h_w(f)$ is the discrete wavelet transformation in the high-pass band, and $l_w(f)$ is the discrete wavelet transformation in the low-pass band.
2.3 Statistics-based function 

Statistics-based functions use features such as 
variance and autocorrelation to calculate the degree of 
focusing for each image. 

Variance (VAR). This function computes the variations in the gray level among the image pixels, where bright and dark pixels have the same influence.

$$AF_{VAR} = \frac{1}{MN}\sum_{x}\sum_{y}\left[g(x, y) - \bar{g}\right]^2 \tag{11}$$

where $\bar{g} = \frac{1}{MN}\sum_{x}\sum_{y} g(x, y)$ is the image mean.

Normalized variance (NVAR). This function is a variation of Eq. (11) obtained by normalizing with the mean $\bar{g}$, which compensates for changes in the average image brightness.

$$AF_{NVAR} = \frac{1}{MN\bar{g}}\sum_{x}\sum_{y}\left[g(x, y) - \bar{g}\right]^2 \tag{12}$$
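Both statistics-based measures in Eqs. (11) and (12) reduce to a couple of passes over the pixel values, as this Python sketch shows. The function names are illustrative, and the normalized version assumes a non-zero image mean.

```python
def variance_focus(img):
    """Gray-level variance of the image (Eq. 11)."""
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def normalized_variance_focus(img):
    """Variance divided by the image mean (Eq. 12); assumes mean != 0."""
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    return variance_focus(img) / mean


img = [[0, 4], [0, 4]]
print(variance_focus(img), normalized_variance_focus(img))
```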
Vollath-4 (VOL4). This algorithm proposed by Vollath is based on an autocorrelation function.

$$AF_{VOL4} = \sum_{x=1}^{M-1}\sum_{y=1}^{N} g(x, y) \cdot g(x+1, y) - \sum_{x=1}^{M-2}\sum_{y=1}^{N} g(x, y) \cdot g(x+2, y) \tag{13}$$

Vollath-5 (VOL5). Vollath proposed a modification of Eq. (13) which is based on the standard deviation function.

$$AF_{VOL5} = \sum_{x=1}^{M-1}\sum_{y=1}^{N} g(x, y) \cdot g(x+1, y) - MN\bar{g}^2 \tag{14}$$
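The two Vollath measures differ only in the term subtracted from the one-pixel autocorrelation, which a short Python sketch makes visible. It assumes `img[x][y]` indexing with x running over the M rows; the function names are illustrative.

```python
def vollath4(img):
    """Autocorrelation at lag 1 minus autocorrelation at lag 2 (Eq. 13)."""
    m, n = len(img), len(img[0])
    lag1 = sum(img[x][y] * img[x + 1][y] for x in range(m - 1) for y in range(n))
    lag2 = sum(img[x][y] * img[x + 2][y] for x in range(m - 2) for y in range(n))
    return lag1 - lag2


def vollath5(img):
    """Autocorrelation at lag 1 minus M*N times the squared mean (Eq. 14)."""
    m, n = len(img), len(img[0])
    mean = sum(sum(row) for row in img) / (m * n)
    lag1 = sum(img[x][y] * img[x + 1][y] for x in range(m - 1) for y in range(n))
    return lag1 - m * n * mean ** 2


column = [[1], [2], [3]]   # a 3x1 image, small enough to check by hand
print(vollath4(column), vollath5(column))
```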


2.4 Histogram-based function 

These functions are based on the assumption that focused images have a greater number of gray levels than defocused images.

Log-histogram (LOG). The image histogram approximates a probability distribution function of gray levels, where the variance of this distribution increases as the image sharpness increases. This algorithm focuses on the bright pixels in the image by multiplying the variance by the logarithm function:

$$AF_{LOG} = \sum_{l}\left[l - E_{\log}\{l\}\right]^2 \cdot \log(p_l) \tag{15}$$

where $p_l$ is the probability of the intensity level $l$ and $E_{\log}\{l\} = \sum_{l} l \cdot \log(p_l)$ is the expected value of the log-histogram.

Weighted histogram (WHS). Images focused under fluorescence illumination present a higher proportion of pixels with bright gray levels than unfocused images. This recently proposed algorithm is based on a weighted image histogram and does not introduce a constant threshold.


APR is = 2 [VED 15+ 10-5 | (16) 


2.5 Intensity-based function 

The intensity of the image is another feature that characterizes the degree of focus. It can be estimated in different ways.

Power squared (PS). This focus function sums the squares of all image intensities.

$$AF_{PS} = \sum_{x}\sum_{y} g(x, y)^2 \tag{17}$$

Threshold (TH). This function counts the number of pixels above a threshold as follows:

$$AF_{TH} = \sum_{x}\sum_{y} T\left[g(x, y)\right] \tag{18}$$

with

$$T\left[g(x, y)\right] = \begin{cases} 1 & \text{if } g(x, y) > T \\ 0 & \text{otherwise} \end{cases} \tag{19}$$

We used a fixed threshold at 50% of the maximum brightness value in the whole stack.
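The two intensity-based measures, together with the stack-wide threshold described above, can be sketched in Python as follows. A stack is assumed to be a list of images, each a list of rows; the helper names and the `fraction` parameter are illustrative.

```python
def power_squared(img):
    """Sum of squared pixel intensities (Eq. 17)."""
    return sum(p * p for row in img for p in row)


def threshold_count(img, t):
    """Number of pixels strictly above the threshold t (Eqs. 18 and 19)."""
    return sum(1 for row in img for p in row if p > t)


def stack_threshold(stack, fraction=0.5):
    """Fixed threshold: a fraction of the brightest pixel in the whole stack."""
    return fraction * max(p for img in stack for row in img for p in row)


stack = [[[0, 10], [20, 40]],
         [[5, 25], [30, 15]]]
t = stack_threshold(stack)
print(t, [threshold_count(img, t) for img in stack], power_squared(stack[0]))
```

Since TH only compares and counts, it needs no multiplications at all, which fits its ranking as the fastest method in the timing results.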


3 Results 


3.1 Accuracy error and computational time 
Because the Z-axis step is small, it is hard for human observers to distinguish the best focused image around the focus plane. Different observers could choose slightly different images around the focus plane, so the assessment of autofocus algorithms would differ slightly from one observer to another. In this paper, we adopted an objective way to determine the optimal focus plane to avoid subjective deviation. Autofocus functions were computed for each stack. The focus plane calculated by each function was not the same. Therefore, we defined the focus point as the frame that the largest number of algorithms selected as the focus point. We compared the focus point obtained by most algorithms with that chosen by human observers. The results are shown in Table 1. Because the results are similar at ×10 and ×40, only the data at ×40 are shown in the present work.

Table 1 Comparison of most algorithms and human observers at ×40

Stack     Most algorithms (frame)   Human observer 1 (frame)   Deviation 1 (frame)   Human observer 2 (frame)   Deviation 2 (frame)
Stack1    32                        33                         1                     32                         0
Stack2    32                        32                         0                     32                         0
Stack3    32                        32                         0                     34                         2
Stack4    34                        33                         -1                    33                         -1
Stack5    31                        32                         1                     33                         2
Stack6    37                        38                         1                     36                         -1
Stack7    30                        31                         1                     30                         0
Stack8    31                        30                         -1                    30                         -1
Stack9    33                        33                         0                     34                         1
Stack10   34                        35                         1                     34                         0
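The consensus rule for the reference focus point can be sketched in Python as a majority vote over the frames picked by each algorithm. This is a minimal sketch under assumptions: frame indices are 0-based, ties between frames are resolved by whichever frame is encountered first (the paper does not say how ties are handled), and the names `argmax` and `consensus_focus` are illustrative.

```python
from collections import Counter


def argmax(values):
    """Index of the largest value (first one if there are ties)."""
    return max(range(len(values)), key=values.__getitem__)


def consensus_focus(focus_curves):
    """focus_curves maps an algorithm name to its focus values over one stack.
    Returns the frame chosen by most algorithms and each algorithm's
    deviation from it, in frames."""
    picks = {name: argmax(curve) for name, curve in focus_curves.items()}
    frame, _votes = Counter(picks.values()).most_common(1)[0]
    errors = {name: pick - frame for name, pick in picks.items()}
    return frame, errors


# Toy curves for three hypothetical algorithms over a 3-frame stack.
curves = {"A": [1, 3, 2], "B": [0, 5, 1], "C": [1, 2, 9]}
frame, errors = consensus_focus(curves)
print(frame, errors)
```

Multiplying a deviation in frames by the Z step (0.5 μm at ×40, 2 μm at ×10) gives the error in micrometres.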

The deviation from the focus point was taken as the focus error for each algorithm. The errors of the algorithms applied to the ten stacks are shown in Figure 1. The variance of the data is indicated by the lines drawn above the bars. The stacks were acquired at different focus points using a constant Z step (ΔZ = 0.5 μm at ×40, ΔZ = 2 μm at ×10). An algorithm whose error (sum of average and variance) is less than one step is considered to have high performance.

Fig. 1 Error according to magnification value
Error (μm) for each algorithm (BREN, LAP, TEN, ATEN, GS, DCT, MDCT, WT, VAR, NVAR, VOL4, VOL5, LOG, WHS, PS, TH) at ×40 (NA = 0.75) and ×10 (NA = 0.30).

According to the plots, ATEN, BREN, DCT, WT, MDCT, GS and TEN show high performance at ×40 (error less than 0.5 μm, i.e. one frame distance), while ATEN, BREN, DCT, MDCT, LAP, GS, TEN, VAR, VOL4 and WHS show high performance at ×10 (error less than 2 μm, i.e. one frame distance). Thus ATEN, BREN, DCT, MDCT, GS and TEN show high performance at both magnifications. Furthermore, ATEN achieves the best accuracy of all algorithms. In addition, the percentage of correctly focused images was computed and is shown in Figure 2.

Fig. 2 Percentage of images correctly focused
Correctly focused stacks (%) for each algorithm at ×40 (NA = 0.75) and ×10 (NA = 0.30).

There is a trade-off between computational time and accuracy in automatic systems, and the algorithms that rank best in computational time are not necessarily the most accurate (Figure 3). The autofocus functions were implemented in Matlab 7.6.0 (The MathWorks, USA) on an Intel Core i5 3.50 GHz computer with 16 GB RAM running the Windows 8 operating system (Microsoft, USA). Among the evaluated algorithms, the TH method was the fastest, at 5.28 ms per image, and the WT method was the slowest, at 289.65 ms per image.
Fig. 3 Mean computation time required by each algorithm per image


3.2 Noise responses 

The performance of the autofocus functions was also evaluated with different levels of noise added to the stacks. We added increasing levels of white Gaussian noise to the original data (Figure 4). Although the values used do not correspond to real experimental conditions, this test gives additional information about the robustness of the algorithms. We calculated the influence of the noise on the accuracy error, and the results for noise robustness are summarized in Figure 5. Most of the algorithms are relatively stable until the distortion becomes very large, with the exception of WT, which is more sensitive to noise than the other algorithms.

Fig. 4 C. elegans lipid droplets fluorescence images used in evaluation
(a) ×40 focused image. (b) ×40 focused image with 80 dBW Gaussian noise. (c) ×10 focused image. (d) ×10 focused image with 80 dBW Gaussian noise.

Fig. 5 Responses of algorithms to white Gaussian noise at ×40 (a) and ×10 (b)
Error (μm, logarithmic scale) against noise energy (dBW), one curve per algorithm. Because the value of ATEN is 0, 0, 0, 0, 0.2 here, it does not show in the logarithmic coordinate.
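The noise test described in this section can be approximated with a small Python sketch. It assumes that a noise level in dBW is converted to a linear power P = 10^(dBW/10) and that this power is used directly as the variance of zero-mean Gaussian noise, which is the convention of MATLAB's `wgn` with its default 1 ohm load; the paper does not state how the noise was generated, so this is an assumption. The function name and the `seed` parameter are illustrative, and no clipping to the 16-bit range is applied.

```python
import math
import random


def add_white_gaussian_noise(img, power_dbw, seed=None):
    """Return a copy of img with zero-mean white Gaussian noise added.
    The noise variance is taken to be 10 ** (power_dbw / 10) (assumption)."""
    rng = random.Random(seed)
    sigma = math.sqrt(10 ** (power_dbw / 10.0))  # dBW -> power -> std dev
    return [[p + rng.gauss(0.0, sigma) for p in row] for row in img]


clean = [[1000.0] * 4 for _ in range(4)]
noisy = add_white_gaussian_noise(clean, power_dbw=80, seed=0)
print(noisy[0][:2])
```

Under this convention 80 dBW corresponds to a standard deviation of 10^4 gray levels, a substantial fraction of the 16-bit range of the acquired images.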
3.3 Accuracy in focus curve

A reliable and fast autofocusing method is an important part of an automated system, and the shape of the focus curve plays an influential role in this respect. In theory the focus function should be unimodal, but in practice it can present several local maxima, which can affect the convergence of the autofocus procedure. Moreover, the focus curve should be sharp at the peak, which speeds up the convergence of the procedure. In order to characterize the autofocus algorithms more completely, we take into account two aspects: the number of local maxima and the width of the focus curve at 50% of its maximum (FWHM, full width at half maximum).
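The two curve-shape criteria can be sketched in Python as follows. This is a sketch under stated assumptions: a local maximum is an interior sample strictly greater than both neighbours (plateaus and end points are not counted), the half-maximum level is half of the peak value (so focus values are assumed positive), and the crossing points are found by linear interpolation between frames. The function names and the `step_um` parameter are illustrative.

```python
def count_local_maxima(curve):
    """Number of interior samples strictly greater than both neighbours."""
    return sum(1 for i in range(1, len(curve) - 1)
               if curve[i - 1] < curve[i] > curve[i + 1])


def fwhm(curve, step_um):
    """Full width at half maximum of a focus curve, in micrometres."""
    peak = max(range(len(curve)), key=curve.__getitem__)
    half = curve[peak] / 2.0

    def crossing(i, direction):
        # Walk away from the peak while the curve stays at or above half max.
        while 0 <= i + direction < len(curve) and curve[i + direction] >= half:
            i += direction
        j = i + direction
        if j < 0 or j >= len(curve):
            return float(i)              # never drops below half: stack edge
        # Linear interpolation between sample i (>= half) and sample j (< half).
        frac = (curve[i] - half) / (curve[i] - curve[j])
        return i + direction * frac

    return (crossing(peak, 1) - crossing(peak, -1)) * step_um


narrow = [0, 1, 4, 1, 0]
bumpy = [0, 3, 1, 4, 0]
print(count_local_maxima(narrow), count_local_maxima(bumpy), fwhm(narrow, 0.5))
```

A curve with several local maxima can trap a hill-climbing search away from the true focus, while a small FWHM means the peak is sharp and easy to localize.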

First, Figure 6 shows that most algorithms present a unique maximum, the exception being PS. In terms of the width of the focus curve (Figure 7), ATEN, BREN, DCT, WT, MDCT, LAP, GS, TEN, VOL4 and WHS show high performance at both ×40 (less than 10 μm) and ×10 (less than 40 μm).


Fig. 6 Comparison of focus algorithms in terms of averaged number of local maxima (including the global maximum)
Number of focus maxima for each algorithm at ×40 (NA = 0.75) and ×10 (NA = 0.30).

Fig. 7 Comparison of focus algorithms in terms of average FWHM of the focus curve
Results for each algorithm at ×40 (NA = 0.75) and ×10 (NA = 0.30).


4 Discussion 


Identifying the genes related to fat metabolism in C. elegans is a laborious and time-consuming task. Therefore, an automatic screening system, from image acquisition to analysis, will be beneficial to gene identification in C. elegans. We have presented here a study of autofocus algorithms for automatic detection of C. elegans lipid droplets in fluorescence images.

From the results, most of the algorithms show a low accuracy error. In particular, ATEN, BREN, DCT, MDCT, GS and TEN show high performance at both ×40 and ×10 magnification, so they could be regarded as suitable algorithms for a C. elegans automatic system. Moreover, ATEN achieves the best accuracy of all algorithms for C. elegans lipid droplets. In the presence of noise, the results of most of the algorithms do not change noticeably, the exception being WT. These results show that most of the algorithms are quite robust to noise. In addition, most of the algorithms show no false maxima, the exception being PS. In terms of the width of the focus curve, ATEN, BREN, DCT, WT, MDCT, LAP, GS, TEN, VOL4 and WHS show high performance at both ×40 and ×10 magnification.
To sum up, in our study WT shows poor accuracy and has a long computational time, so it is not suitable for C. elegans lipid droplets, although WT may perform well in other applications. Considering both accuracy and computational time, we recommend ATEN, MDCT and TEN for C. elegans lipid droplets, with ATEN achieving the best accuracy. In an automatic screening system, both high accuracy and fast acquisition are often required. In this case, the fastest algorithm, TH, can be applied for a rough search, followed by the ATEN algorithm for a fine search.
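The rough-then-fine strategy suggested above could be organized along the following lines. This Python sketch assumes that a focus measure is available as a function of the frame index (in a real system, each call would move the stage and acquire an image), and the coarse step of 5 frames is a hypothetical value, not one proposed in the paper. The names `coarse_to_fine`, `coarse_measure` and `fine_measure` are illustrative.

```python
def coarse_to_fine(n_frames, coarse_measure, fine_measure, coarse_step=5):
    """Two-stage focus search over frame indices 0 .. n_frames - 1.
    coarse_measure(i) and fine_measure(i) return the focus value of frame i,
    e.g. TH for the rough pass and ATEN for the fine pass."""
    # Rough pass: evaluate the fast measure on every coarse_step-th frame.
    best = max(range(0, n_frames, coarse_step), key=coarse_measure)
    # Fine pass: evaluate the accurate measure around the rough optimum.
    lo = max(0, best - coarse_step)
    hi = min(n_frames - 1, best + coarse_step)
    return max(range(lo, hi + 1), key=fine_measure)


calls = []


def toy_measure(i):
    """Toy unimodal focus curve peaking at frame 32; records each evaluation."""
    calls.append(i)
    return -abs(i - 32)


focus_frame = coarse_to_fine(60, toy_measure, toy_measure)
print(focus_frame, len(calls))
```

With a 60-frame stack and a step of 5, the sketch evaluates 12 frames in the rough pass and 11 in the fine pass instead of all 60, at the cost of assuming that the fast measure is roughly unimodal near the focus.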